Reproducible experiments on Three-Dimensional Entity Resolution with JedAI

Abstract

In Papadakis et al. (2020), we presented the latest release of JedAI, an open-source Entity Resolution (ER) system that allows for building a large variety of end-to-end ER pipelines. Through a thorough experimental evaluation, we compared a schema-agnostic ER pipeline based on blocks with another, schema-based ER pipeline based on similarity joins. We applied them to 10 established, real-world datasets and assessed them with respect to effectiveness and time efficiency. Special care was taken to juxtapose their scalability, too, using seven synthetic datasets. Moreover, we experimentally compared the batch schema-agnostic pipeline with its progressive counterpart. In this companion paper, we describe how to reproduce the entire experimental study that pertains to JedAI’s serial execution through its intuitive user interface. We also explain how to examine the robustness of the parameter configurations we have selected.

Similar resources

JedAI: The Force Behind Entity Resolution

We present JedAI, a toolkit for Entity Resolution that can be used in three different ways: as an open-source Java library that implements numerous state-of-the-art, domain-independent methods, as a workbench that facilitates the evaluation of their relative performance and as a desktop application that offers out-of-the-box ER solutions. JedAI bridges the gap between the database and the Seman...

ENCORE: Experiments with a Synthetic Entity Co-reference Resolution Tool

We present ENCORE, a system for entity co-reference resolution that synthesizes the outputs of several off-the-shelf co-reference resolution systems. To boost precision, we filter the output using a named entity recognition tool called SYNERGY which itself is a synthesis of several off-the-shelf NER systems. ENCORE is designed to work under two conditions: NP-CR which resolves noun phrase co-re...

MakeSense: Managing Reproducible WSNs Experiments

Wireless Sensor Network (WSN) users often run simulation campaigns before real deployment to evaluate performance and to fine-tune application and network parameters. This process requires repeating the same experiments under similar conditions and collecting, parsing and presenting data efficiently. This paper introduces MakeSense: a tool that automates this workflow and that allows reproducing sim...

Calibrating MoonGen for Reproducible Experiments

which are needed for high-speed operation mode, require purchasing a license. In netmap user space applications do not have direct access to the NIC’s registers. This is a safety precaution as a misconfigured NIC can crash the whole system by corrupting memory [20]. This restriction in netmap is critical as it is designed to be included in an operating system: netmap is already part of the Free...

Computational experiments on three-dimensional molecular diffusion in porous media

Estimations of apparent diffusion coefficients usually consist of curve-fitting the output of 1-D models to experimental laboratory-measured data from porous aggregates shaped in different forms. In this research, a computational exploration is presented on the alternative use of three-dimensional models for the same purpose. The outputs of the 3-D models were compared to results generated by o...

Journal

Journal title: Information Systems

Year: 2021

ISSN: 0306-4379, 1873-6076

DOI: https://doi.org/10.1016/j.is.2021.101830